61 research outputs found

Protein subfamily assignment using the Conserved Domain Database

Author: Fong Jessica H
Marchler-Bauer Aron
Publication venue: BioMed Central
Publication date: 01/01/2008

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

AlexSys: a knowledge-based expert system for multiple sequence alignment construction and analysis

Author: Aniba Mohamed Radhouene
Marchler-Bauer Aron
Poch Olivier
Thompson Julie Dawn
Publication venue: Oxford University Press
Publication date: 01/01/2010

Multiple sequence alignment (MSA) is a cornerstone of modern molecular biology and represents a unique means of investigating the patterns of conservation and diversity in complex biological systems. Many different algorithms have been developed to construct MSAs, but previous studies have shown that no single aligner consistently outperforms the rest. This has led to the development of a number of ‘meta-methods’ that systematically run several aligners and merge the output into one single solution. Although these methods generally produce more accurate alignments, they are inefficient because all the aligners need to be run first and the choice of the best solution is made a posteriori. Here, we describe the development of a new expert system, AlexSys, for the multiple alignment of protein sequences. AlexSys incorporates an intelligent inference engine to automatically select an appropriate aligner a priori, depending only on the nature of the input sequences. The inference engine was trained on a large set of reference multiple alignments, using a novel machine learning approach. Applying AlexSys to a test set of 178 alignments, we show that the expert system represents a good compromise between alignment quality and running time, making it suitable for high throughput projects. AlexSys is freely available from http://alnitak.u-strasbg.fr/~aniba/alexsys

CiteSeerX

HAL-Inserm

PubMed Central

Inferred Biomolecular Interaction Server—a web server to analyze and predict protein interacting partners and binding sites

Author: Anna R. Panchenko
Aron Marchler-Bauer
Benjamin A. Shoemaker
Dachuan Zhang
Jessica H. Fong
Manoj Tyagi
Ratna R. Thangudu
Stephen H. Bryant
Thomas Madej
Publication venue: Oxford University Press

IBIS is the NCBI Inferred Biomolecular Interaction Server. This server organizes, analyzes and predicts interaction partners and locations of binding sites in proteins. IBIS provides annotations for different types of binding partners (protein, chemical, nucleic acid and peptides), and facilitates the mapping of a comprehensive biomolecular interaction network for a given protein query. IBIS reports interactions observed in experimentally determined structural complexes of a given protein, and at the same time IBIS infers binding sites/interacting partners by inspecting protein complexes formed by homologous proteins. Similar binding sites are clustered together based on their sequence and structure conservation. To emphasize biologically relevant binding sites, several algorithms are used for verification in terms of evolutionary conservation, biological importance of binding partners, size and stability of interfaces, as well as evidence from the published literature. IBIS is updated regularly and is freely accessible via http://www.ncbi.nlm.nih.gov/Structure/ibis/ibis.html

Crossref

PubMed Central

MMDB: annotating protein sequences with Entrez's 3D-structure database

Author: Addess Kenneth J.
Bryant Stephen H.
Chen Jie
Geer Lewis Y.
He Jane
He Siqian
Lu Shennan
Madej Thomas
Marchler-Bauer Aron
Thiessen Paul A.
Wang Yanli
Zhang Naigong
Publication venue: Oxford University Press
Publication date: 29/11/2006

Three-dimensional (3D) structure is now known for a large fraction of all protein families. Thus, it has become rather likely that one will find a homolog with known 3D structure when searching a sequence database with an arbitrary query sequence. Depending on the extent of similarity, such neighbor relationships may allow one to infer biological function and to identify functional sites such as binding motifs or catalytic centers. Entrez's 3D-structure database, the Molecular Modeling Database (MMDB), provides easy access to the richness of 3D structure data and its large potential for functional annotation. Entrez's search engine offers several tools to assist biologist users: (i) links between databases, such as between protein sequences and structures, (ii) pre-computed sequence and structure neighbors, (iii) visualization of structure and sequence/structure alignment. Here, we describe an annotation service that combines some of these tools automatically, Entrez's ‘Related Structure’ links. For all proteins in Entrez, similar sequences with known 3D structure are detected by BLAST and alignments are recorded. The ‘Related Structure’ service summarizes this information and presents 3D views mapping sequence residues onto all 3D structures available in MMDB

Crossref

PubMed Central

CDD: a Conserved Domain Database for protein classification

Author: Anderson John B.
Bryant Stephen H.
Cherukuri Praveen F.
DeWeese-Scott Carol
Geer Lewis Y.
Gwadz Marc
He Siqian
Hurwitz David I.
Jackson John D.
Ke Zhaoxi
Lanczycki Christopher J.
Liebert Cynthia A.
Liu Chunlei
Lu Fu
Marchler Gabriele H.
Marchler-Bauer Aron
Mullokandov Mikhail
Shoemaker Benjamin A.
Simonyan Vahan
Song James S.
Thiessen Paul A.
Yamashita Roxanne A.
Yin Jodie J.
Zhang Dachuan
Publication venue: Oxford University Press
Publication date: 17/12/2004

The Conserved Domain Database (CDD) is the protein classification component of NCBI's Entrez query and retrieval system. CDD is linked to other Entrez databases such as Proteins, Taxonomy and PubMed®, and can be accessed at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=cdd. CD-Search, which is available at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi, is a fast, interactive tool to identify conserved domains in new protein sequences. CD-Search results for protein sequences in Entrez are pre-computed to provide links between proteins and domain models, and computational annotation visible upon request. Protein–protein queries submitted to NCBI's BLAST search service at http://www.ncbi.nlm.nih.gov/BLAST are scanned for the presence of conserved domains by default. While CDD started out as essentially a mirror of publicly available domain alignment collections, such as SMART, Pfam and COG, we have continued an effort to update, and in some cases replace these models with domain hierarchies curated at the NCBI. Here, we report on the progress of the curation effort and associated improvements in the functionality of the CDD information retrieval system

CiteSeerX

Crossref

PubMed Central

InterPro in 2022.

Author: Bateman Alex
Bileschi Maxwell L
Blum Matthias
Bork Peer
Bridge Alan
Chuguransky Sara
Colwell Lucy
Gough Julian
Grego Tiago
Haft Daniel H
Letunić Ivica
Marchler-Bauer Aron
Mi Huaiyu
Natale Darren A
Orengo Christine A
Pandurangan Arun P
Paysan-Lafosse Typhaine
Pinto Beatriz Lázaro
Rivoire Catherine
Salazar Gustavo A
Sigrist Christian JA
Sillitoe Ian
Thanki Narmada
Thomas Paul D
Tosatto Silvio CE
Wu Cathy H
Publication venue: Oxford University Press (OUP)
Publication date: 09/11/2022

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. Here, we report recent developments with InterPro (version 90.0) and its associated software, including updates to data content and to the website. These developments extend and enrich the information provided by InterPro, and provide a more user friendly access to the data. Additionally, we have worked on adding Pfam website features to the InterPro website, as the Pfam website will be retired in late 2022. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB. Moreover, we report the development of a card game as a method of engaging the non-scientific community. Finally, we discuss the benefits and challenges brought by the use of artificial intelligence for protein structure prediction

UCL Discovery

Database resources of the National Center for Biotechnology Information

Author: Alexandre Souvorov
Anna Panchenko
Aron Marchler-Bauer
David J. Lipman
David Landsman
Deanna M. Church
Dennis A. Benson
Donna R. Maglott
Douglas Slotta
Edwin Sequeira
Eric W. Sayers
Eugene Yaschenko
Evan Bolton
Gregory D. Schuler
Grigory Starchenko
Ilene Mizrachi
James Ostell
Jian Ye
Karl Sirotkin
Kathi Canese
Kim D. Pruitt
Lewis Y. Geer
Lukas Wagner
Martin Shumway
Michael DiCuccio
Michael Feolo
Scott Federhen
Stephen H. Bryant
Stephen T. Sherry
Tanya Barrett
Tatiana A. Tatusova
Thomas L. Madden
Tom Madej
Vadim Miller
Vyacheslav Chetvernin
W. John Wilbur
Wolfgang Helmberg
Yanli Wang
Yuri Kapustin
Zhiyong Lu
Publication venue: Oxford University Press
Publication date: 01/01/2010

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, Reference Sequence, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Peptidome, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov

CiteSeerX

Crossref

PubMed Central